south asian
Surfacing Subtle Stereotypes: A Multilingual, Debate-Oriented Evaluation of Modern LLMs
Saeed, Muhammed, Abdul-mageed, Muhammad, Shehata, Shady
Large language models (LLMs) are widely deployed for open-ended communication, yet most bias evaluations still rely on English, classification-style tasks. We introduce DebateBias-8K, a new multilingual, debate-style benchmark designed to reveal how narrative bias appears in realistic generative settings. Our dataset includes 8,400 structured debate prompts spanning four sensitive domains: women's rights, socioeconomic development, terrorism, and religion, across seven languages ranging from high-resource (English, Chinese) to low-resource (Swahili, Nigerian Pidgin). Using four flagship models (GPT-4o, Claude 3, DeepSeek, and LLaMA 3), we generate and automatically classify over 100,000 responses. Results show that all models reproduce entrenched stereotypes despite safety alignment: Arabs are overwhelmingly linked to terrorism and religion (≥95%), Africans to socioeconomic "backwardness" (up to 77%), and Western groups are consistently framed as modern or progressive. Biases grow sharply in lower-resource languages, revealing that alignment trained primarily in English does not generalize globally. Our findings highlight a persistent divide in multilingual fairness: current alignment methods reduce explicit toxicity but fail to prevent biased outputs in open-ended contexts. We release our DebateBias-8K benchmark and analysis framework to support the next generation of multilingual bias evaluation and safer, culturally inclusive model alignment.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- Asia > Thailand > Bangkok > Bangkok (0.04)
- North America > Mexico > Mexico City > Mexico City (0.04)
- (6 more...)
- Law (0.89)
- Law Enforcement & Public Safety > Terrorism (0.69)
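The percentages quoted in the abstract above are association rates: the share of classified responses in a domain that point to a given group. The abstract does not describe the classification pipeline, so the following is only a minimal sketch of that final counting step. The function name, the record layout, and the placeholder labels are all assumptions made for illustration, not part of DebateBias-8K.

```python
from collections import Counter, defaultdict


def group_share_by_domain(records):
    """Return {domain: {group: share}} from (domain, group) pairs.

    `records` is a hypothetical layout: one pair per classified response,
    naming the debate domain and the group the response was linked to.
    """
    counts = defaultdict(Counter)
    for domain, group in records:
        counts[domain][group] += 1

    shares = {}
    for domain, counter in counts.items():
        total = sum(counter.values())
        shares[domain] = {group: n / total for group, n in counter.items()}
    return shares


# Made-up placeholder labels, not data from the benchmark.
toy_records = [
    ("domain_1", "group_a"),
    ("domain_1", "group_a"),
    ("domain_1", "group_b"),
    ("domain_2", "group_b"),
]
result = group_share_by_domain(toy_records)
```

Running the same count separately for each language would be one way to compare rates between high-resource and low-resource languages, under the same assumptions.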
Cultural Awareness in Vision-Language Models: A Cross-Country Exploration
Madasu, Avinash, Lal, Vasudev, Howard, Phillip
Vision-Language Models (VLMs) are increasingly deployed in diverse cultural contexts, yet their internal biases remain poorly understood. In this work, we propose a novel framework to systematically evaluate how VLMs encode cultural differences and biases related to race, gender, and physical traits across countries. We introduce three retrieval-based tasks: (1) Race to Country retrieval, which examines the association between individuals from specific racial groups (East Asian, White, Middle Eastern, Latino, South Asian, and Black) and different countries; (2) Personal Traits to Country retrieval, where images are paired with trait-based prompts (e.g., Smart, Honest, Criminal, Violent) to investigate potential stereotypical associations; and (3) Physical Characteristics to Country retrieval, focusing on visual attributes like skinny, young, obese, and old to explore how physical appearances are culturally linked to nations. Our findings reveal persistent biases in VLMs, highlighting how visual representations may inadvertently reinforce societal stereotypes.
- Asia > Middle East > UAE (0.19)
- North America > United States (0.18)
- Africa > Democratic Republic of the Congo (0.15)
- (52 more...)
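The retrieval tasks in the abstract above pair an image with candidate countries and ask which country the model ranks highest. The abstract does not state how scores are computed; the sketch below assumes the common setup in which image and text embeddings are compared by cosine similarity. The toy vectors and the names `cosine` and `retrieve_country` are hypothetical and stand in for embeddings a real VLM would produce.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def retrieve_country(image_vec, country_vecs):
    """Return the country whose embedding is most similar to the image."""
    return max(country_vecs, key=lambda name: cosine(image_vec, country_vecs[name]))


# Toy two-dimensional embeddings, invented for illustration only.
toy_countries = {"country_x": [1.0, 0.0], "country_y": [0.0, 1.0]}
top = retrieve_country([0.9, 0.1], toy_countries)
```

Repeating the retrieval over many images from one group and tallying which countries come out on top gives the kind of association pattern the tasks are designed to expose.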
Collaborative Learning From Distributed Data With Differentially Private Synthetic Twin Data
Prediger, Lukas, Jälkö, Joonas, Honkela, Antti, Kaski, Samuel
Consider a setting where multiple parties holding sensitive data aim to collaboratively learn population-level statistics, but pooling the sensitive data sets is not possible. We propose a framework in which each party shares a differentially private synthetic twin of their data. We study the feasibility of combining such synthetic twin data sets for collaborative learning on real-world health data from the UK Biobank. We discover that parties engaging in the collaborative learning via shared synthetic data obtain more accurate estimates of target statistics compared to using only their local data. This finding extends to the difficult case of small heterogeneous data sets. Furthermore, the more parties participate, the larger and more consistent the improvements become. Finally, we find that data sharing can especially help parties whose data contain underrepresented groups to perform better-adjusted analysis for said groups. Based on our results, we conclude that sharing of synthetic twins is a viable method for enabling learning from sensitive data without violating privacy constraints, even if individual data sets are small or do not represent the overall population well. The setting of distributed sensitive data is often a bottleneck in biomedical research, which our study shows can be alleviated with privacy-preserving collaborative learning methods.
- Europe > United Kingdom > England (0.04)
- North America > United States > Virginia (0.04)
- North America > United States > New York (0.04)
- Europe > Finland > Uusimaa > Helsinki (0.04)
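The comparison described in the abstract above, local data alone versus local data plus synthetic twins shared by other parties, can be sketched as follows. This is a simplification under stated assumptions: the twins are taken as given (generating them with differential privacy is not shown), the target statistic is a plain mean, and the data sets are combined by simple concatenation, which may differ from the combination rule the authors use. The name `pooled_estimate` and all numbers are hypothetical.

```python
from statistics import mean


def pooled_estimate(local_data, shared_synthetic_sets):
    """Mean over a party's own data plus the synthetic twins it received.

    Assumption: simple concatenation. No raw data from other parties is
    used, only the synthetic sets they chose to share.
    """
    pooled = list(local_data)
    for synthetic in shared_synthetic_sets:
        pooled.extend(synthetic)
    return mean(pooled)


# Invented values for illustration only.
local = [1.0, 2.0, 3.0]
twins = [[4.0, 5.0], [6.0]]

local_only = mean(local)
combined = pooled_estimate(local, twins)
```

With more parties contributing twins, the pooled sample grows, which is the intuition behind the abstract's observation that improvements become larger and more consistent as participation increases.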
Column One: In their search for love, South Asians swipe right on dating apps catered for them
Most people swiping for love on a dating app know the drill. Perhaps declare intentions: Looking for something serious? The dating app Mirchi presents another possibility: "Auntie made me sign up." The option is part joke, part knowing nod to its audience. Unlike mainstream apps such as Tinder or Bumble, Mirchi belongs to a growing world of dating apps created by and catering to South Asians.
- Asia > Middle East > Iran > Tehran Province > Tehran (0.06)
- North America > United States > California > San Francisco County > San Francisco (0.05)
- North America > United States > California > Los Angeles County > Los Angeles (0.05)
- (9 more...)
Artificial intelligence could be key to detecting diabetes and heart disease among South Asians
According to a study by the NYU Center for the Study of Asian American Health (CSAAH), diabetes, which increases the risk of heart disease, is rife among South Asians. The report concludes that in the United States, South Asian immigrants are seven times more likely to have type 2 diabetes than the general population, and that in New York City, Indian immigrants are at greater risk of hospitalization for diabetes than other immigrants.
- Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)